A Machine Learning Approach to Enhance Scoring Performance in Docking-Based Virtual Screening Experiments: COX-1 as a Case Study
Abstract
Molecular docking can be reasonably successful at reproducing X-ray poses of a ligand in the binding site of a protein. However, scoring functions are typically unsuccessful at correctly ranking ligands according to their binding affinity. Using cyclooxygenase-1 (COX-1), a particularly challenging workhorse in virtual screening (VS), we show how support vector machines (SVMs), trained with the individual energy terms retrieved from docking-based VS experiments, can improve the discrimination between active and inactive compounds. Actives and inactives for COX-1 were obtained from the Directory of Useful Decoys (DUD) and docked into COX-1 with AutoDock Vina (Vina). The energy parameters of Vina’s scoring function were used to train classification models with SVM-light. The results show that Vina offers acceptable pose prediction accuracy, but its scoring function performs poorly compared to our SVM classification models. The superior performance of the trained classification models highlights the potential of using non-linear machine learning methods to identify bioactive compounds through docking-based screening.
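To make "discrimination between active and inactive compounds" concrete, here is a minimal sketch of one common way to quantify it: the area under the ROC curve (AUC), computed as the fraction of active/inactive pairs in which the active is ranked higher. The abstract does not say which metric the authors used, so the choice of AUC is an assumption, as are the function name `roc_auc` and the toy scores and labels below, which are invented for illustration and are not results from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (active, inactive) pairs where the active has the higher score.

    Ties count as half a win. labels: 1 = active, 0 = inactive.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical toy data (NOT from the paper): docking scores in kcal/mol,
# where a more negative value means a better predicted binding energy.
vina_scores = [-9.1, -7.4, -8.0, -8.5, -6.9, -7.8]
labels = [1, 1, 0, 0, 0, 1]

# Negate the scores so that "higher is better" before ranking.
auc = roc_auc([-s for s in vina_scores], labels)
print(f"AUC = {auc:.3f}")
```

An AUC near 0.5 means the ranking is no better than chance, while a value near 1.0 means almost every active is ranked above almost every inactive. The same function could be applied to the decision values of a trained classifier to compare it against the raw docking score.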
Similar resources
Beware of Machine Learning-Based Scoring Functions - On the Danger of Developing Black Boxes
Training machine learning algorithms with protein-ligand descriptors has recently gained considerable attention to predict binding constants from atomic coordinates. Starting from a series of recent reports stating the advantages of this approach over empirical scoring functions, we could indeed reproduce the claimed superiority of Random Forest and Support Vector Machine-based scoring function...
Development of target-biased scoring functions for protein-ligand docking
Accurate scoring of protein-ligand interactions for docking, binding-affinity prediction and virtual screening campaigns is still challenging. Despite great efforts, the performance of existing scoring functions strongly depends on the target structure under investigation. Recent developments in the direction of target-class-specific scoring methods and machine-learning-based procedures reveal s...
ParaDockS - an open-source framework for molecular docking: implementation of target-class-specific scoring methods
Accurate scoring of protein-ligand interactions in molecular docking and virtual screening is still challenging. Despite great efforts, the performance of existing scoring functions strongly depends on the target structure under investigation. Recent developments in the direction of target-class-specific scoring methods and machine-learning-based classification models reveals a significant impro...
Virtual screening approach to identifying influenza virus neuraminidase inhibitors using molecular docking combined with machine-learning-based scoring function
In recent years, an epidemic of the highly pathogenic avian influenza H7N9 virus has persisted in China, with a high mortality rate. To develop novel anti-influenza therapies, we have constructed a machine-learning-based scoring function (RF-NA-Score) for the effective virtual screening of lead compounds targeting the viral neuraminidase (NA) protein. RF-NA-Score is more accurate than RF-Score,...
Protein-specific Scoring Method for Ligand Discovery
Protein-based virtual screening plays an important role in the modern drug discovery process. Most protein-based virtual screening experiments are carried out with docking programs. The accuracy of a docking program relies heavily on the incorporated scoring function, which is based on various energy terms. The existing scoring functions treat all the energy terms with the equal weight function or other weight...